Geospatial data can be expressed in maps, which can offer amazing levels of data density.
You’re probably familiar with JSON, which is frequently used to store and pass data between applications. GeoJSON applies JSON structure to geospatial data in a single JSON file. Let’s get a map of Philly Census Tracts from the City of Philadelphia:
library(rgdal)
philadelphia_map <- readOGR('http://data.phl.opendata.arcgis.com/datasets/8bc0786524a4486bb3cf0f9862ad0fbf_0.geojson')
## OGR data source with driver: GeoJSON
## Source: "http://data.phl.opendata.arcgis.com/datasets/8bc0786524a4486bb3cf0f9862ad0fbf_0.geojson", layer: "OGRGeoJSON"
## with 384 features
## It has 14 fields
We can explore this object using the Environment pane in RStudio, or we can use commands like str (structure) or head (show first few rows) to look into this object.
str(philadelphia_map, max.level = 2)
## Formal class 'SpatialPolygonsDataFrame' [package "sp"] with 5 slots
## ..@ data :'data.frame': 384 obs. of 14 variables:
## ..@ polygons :List of 384
## ..@ plotOrder : int [1:384] 299 344 46 368 304 253 383 127 188 134 ...
## ..@ bbox : num [1:2, 1:2] -75.3 39.9 -75 40.1
## .. ..- attr(*, "dimnames")=List of 2
## ..@ proj4string:Formal class 'CRS' [package "sp"] with 1 slot
head(philadelphia_map@data)
| OBJECTID | STATEFP10 | COUNTYFP10 | TRACTCE10 | GEOID10 | NAME10 | NAMELSAD10 | MTFCC10 | FUNCSTAT10 | ALAND10 | AWATER10 | INTPTLAT10 | INTPTLON10 | LOGRECNO | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 42 | 101 | 009400 | 42101009400 | 94 | Census Tract 94 | G5020 | S | 366717 | 0 | +39.9632709 | -075.2322437 | 10429 |
| 1 | 2 | 42 | 101 | 009500 | 42101009500 | 95 | Census Tract 95 | G5020 | S | 319070 | 0 | +39.9658709 | -075.2379140 | 10430 |
| 2 | 3 | 42 | 101 | 009600 | 42101009600 | 96 | Census Tract 96 | G5020 | S | 405273 | 0 | +39.9655396 | -075.2435075 | 10431 |
| 3 | 4 | 42 | 101 | 013800 | 42101013800 | 138 | Census Tract 138 | G5020 | S | 341256 | 0 | +39.9764504 | -075.1771771 | 10468 |
| 4 | 5 | 42 | 101 | 013900 | 42101013900 | 139 | Census Tract 139 | G5020 | S | 562934 | 0 | +39.9750563 | -075.1711846 | 10469 |
| 5 | 6 | 42 | 101 | 014000 | 42101014000 | 140 | Census Tract 140 | G5020 | S | 439802 | 0 | +39.9735358 | -075.1630966 | 10470 |
The important takeaway:
rgdal will ingest geoJSON or Shapefile data and give you back a “Spatial Polygons Data Frame”@data), and four elements which together make up the map (@polygons, @plotOrder, @bbox, @proj4string).We’re going to use leaflet to create a dynamic map that users can interact with. It’s not going to be the prettiest!
library(leaflet)
leaflet(philadelphia_map) %>%
addPolygons()
We can change default colors and add labels that come from the @data part of our map object.
leaflet(philadelphia_map) %>%
addPolygons(
weight = 1, # border thickness
opacity = 0.5, # border opacity
color = "grey", # border color
fillColor = "white",
fillOpacity = 1,
label = philadelphia_map@data$NAMELSAD10
)
At this point, let’s return to the slide deck!
Currently all we have is a boring map of the Census Tracts in Philadelphia. Let’s add some data to enrich this map with our own patient or research information.
For the purposes of this exercise, we’re going to use public data from the City of Philadelphia. See the download page for more details.
library(rgdal)
child_blood_lead <- read.csv("https://phl.carto.com/api/v2/sql?q=SELECT+*+FROM+child_blood_lead_levels_by_ct&filename=child_blood_lead_levels_by_ct&format=csv&skipfields=cartodb_id,the_geom,the_geom_webmercator", stringsAsFactors = FALSE)
Let’s peek at this data:
head(child_blood_lead)
| census_tract | data_redacted | num_bll_5plus | num_screen | perc_5plus |
|---|---|---|---|---|
| 42101000100 | false | 0 | 100 | 0 |
| 42101000200 | true | NA | 109 | NA |
| 42101000300 | true | NA | 110 | NA |
| 42101000401 | true | NA | 61 | NA |
| 42101000402 | false | 0 | 41 | 0 |
| 42101000500 | true | NA | 49 | NA |
Understanding this data:
WARNING
merge will reorder your rows, and that breaks the relationship between the rows of tabular data and the corresponding polygons. Not what you want! So we’re going to keep track of the order thanks to the helpful “OBJECTID” variable, which is super helpful. If you’re working with data that doesn’t have a column like this, just add it, so you can keep track of what the original order was.
Merging in this case is pretty simple – we just have to bring in the lead data and make sure our “hinge” (overlapping field) is set up properly:
merged <- merge(x = philadelphia_map@data,
y = child_blood_lead,
by.x = "GEOID10",
by.y = "census_tract",
all = TRUE)
philadelphia_map@data <- merged[order(merged$OBJECTID),]
Let’s see what our lead levels look like:
library(leaflet.extras)
lead_palette <- colorBin("Blues", domain = philadelphia_map@data$perc_5plus, bins = 10, na.color = "#aaaaaa")
leaflet(philadelphia_map) %>%
addPolygons(
weight = 1, # border thickness
opacity = 0.5, # border opacity
color = "grey", # border color
fillColor = ~lead_palette(philadelphia_map@data$perc_5plus),
fillOpacity = 1,
label = philadelphia_map@data$NAMELSAD10
) %>%
suspendScroll()
At this point, let’s return to the slide deck!
And here we want to pull in our local file, which is a simplified version of data from the American Community Survey conducted by the Census Bureau. Let’s see what is contains.
economic_data <- read.csv("../Data/philly_census.csv")
head(economic_data)
| census_tract | pct_unemployed | pct_commute_public_transit | median_household_income | mean_household_income | pct_child_uninsured | pct_families_below_poverty_line |
|---|---|---|---|---|---|---|
| 42101000100 | 2.6 | 33.4 | 103772 | 124525 | 33.9 | 1.7 |
| 42101000200 | 7.6 | 22.0 | 50455 | 99337 | 6.5 | 13.7 |
| 42101000300 | 4.6 | 11.5 | 93036 | 121867 | 3.3 | 3.4 |
| 42101000401 | 5.9 | 21.8 | 57604 | 83354 | 0.0 | 23.2 |
| 42101000402 | 2.0 | 14.6 | 70038 | 92625 | 0.0 | 0.0 |
| 42101000500 | 12.3 | 20.6 | 40568 | 60813 | 9.1 | 0.0 |
This is selected economic characteristics of various census tracts. Let’s combine the data here with our map, and use labels to allow people to understand the data better:
merged_again <- merge(x=philadelphia_map@data,
y=economic_data,
by.x = "GEOID10",
by.y = "census_tract",
all = TRUE)
philadelphia_map@data <- merged_again[order(merged_again$OBJECTID),]
head(philadelphia_map@data)
| GEOID10 | OBJECTID | STATEFP10 | COUNTYFP10 | TRACTCE10 | NAME10 | NAMELSAD10 | MTFCC10 | FUNCSTAT10 | ALAND10 | AWATER10 | INTPTLAT10 | INTPTLON10 | LOGRECNO | data_redacted | num_bll_5plus | num_screen | perc_5plus | pct_unemployed | pct_commute_public_transit | median_household_income | mean_household_income | pct_child_uninsured | pct_families_below_poverty_line | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 95 | 42101009400 | 1 | 42 | 101 | 009400 | 94 | Census Tract 94 | G5020 | S | 366717 | 0 | +39.9632709 | -075.2322437 | 10429 | false | 14 | 306 | 4.6 | 19.2 | 49.1 | 18408 | 31068 | 4.6 | 41.5 |
| 96 | 42101009500 | 2 | 42 | 101 | 009500 | 95 | Census Tract 95 | G5020 | S | 319070 | 0 | +39.9658709 | -075.2379140 | 10430 | false | 11 | 253 | 4.3 | 12.3 | 53.2 | 27708 | 33932 | 20.0 | 32.9 |
| 97 | 42101009600 | 3 | 42 | 101 | 009600 | 96 | Census Tract 96 | G5020 | S | 405273 | 0 | +39.9655396 | -075.2435075 | 10431 | false | 14 | 314 | 4.5 | 15.2 | 39.4 | 24402 | 33784 | 1.2 | 39.1 |
| 134 | 42101013800 | 4 | 42 | 101 | 013800 | 138 | Census Tract 138 | G5020 | S | 341256 | 0 | +39.9764504 | -075.1771771 | 10468 | false | 17 | 157 | 10.8 | 15.8 | 19.6 | 28534 | 41352 | 4.8 | 24.2 |
| 135 | 42101013900 | 5 | 42 | 101 | 013900 | 139 | Census Tract 139 | G5020 | S | 562934 | 0 | +39.9750563 | -075.1711846 | 10469 | false | 14 | 264 | 5.3 | 13.4 | 37.8 | 14314 | 30200 | 1.4 | 47.7 |
| 136 | 42101014000 | 6 | 42 | 101 | 014000 | 140 | Census Tract 140 | G5020 | S | 439802 | 0 | +39.9735358 | -075.1630966 | 10470 | false | 7 | 140 | 5.0 | 11.7 | 49.0 | 24474 | 36123 | 3.7 | 33.4 |
Ideally we’d love to be able to have a dual-purpose choropleth that shows both a color-coded blood lead level layer and a color-coded poverty level layer. Let’s do that, using the keyword “group” and some layer controls.
Let’s first create some useful labels that will show when polygons are activated by hover:
labels <- sprintf(
"<strong>%s</strong><br/>
Families Below Poverty Line (%%): %g <br/>
Children With High Blood Lead Levels (%%): %g",
philadelphia_map@data$NAMELSAD10,
philadelphia_map@data$pct_families_below_poverty_line,
philadelphia_map@data$perc_5plus
) %>% lapply(htmltools::HTML)
And now let’s map both poverty and lead data, using two addPolygon layers and some layer selection:
poverty_palette <- colorBin("Reds", domain = philadelphia_map@data$pct_families_below_poverty_line, bins = 10, na.color = "#cccccc")
leaflet(philadelphia_map) %>%
addPolygons(
weight = 1, # border thickness
opacity = 0.5, # border opacity
color = "grey", # border color
fillColor = ~lead_palette(philadelphia_map@data$perc_5plus),
fillOpacity = 0.5,
label = labels,
labelOptions = labelOptions(
style = list("font-weight" = "normal", padding = "3px 8px"),
textsize = "15px",
direction = "auto"),
group = "Lead Level"
) %>%
addPolygons(
weight = 1, # border thickness
opacity = 0.5, # border opacity
color = "grey", # border color
fillColor = ~poverty_palette(philadelphia_map@data$pct_families_below_poverty_line),
fillOpacity = 1,
label = labels,
labelOptions = labelOptions(
style = list("font-weight" = "normal", padding = "3px 8px"),
textsize = "15px",
direction = "auto"),
group = "Poverty Level"
) %>%
addLayersControl(
baseGroups = c("Lead Level", "Poverty Level"),
options = layersControlOptions(collapsed = FALSE)
) %>%
suspendScroll()
We could also add markers to indicate the most vulnerable areas – where child high blood lead was measured at over 15% and the poverty rate is over 25%:
library(dplyr)
most_vulnerable <- philadelphia_map@data %>%
mutate(lat = as.numeric(as.character(INTPTLAT10)), lng = as.numeric(as.character(INTPTLON10))) %>%
filter(pct_families_below_poverty_line > 25, perc_5plus > 15)
leaflet(philadelphia_map) %>%
addPolygons(
weight = 1, # border thickness
opacity = 0.5, # border opacity
color = "grey", # border color
fillColor = ~lead_palette(philadelphia_map@data$perc_5plus),
fillOpacity = 0.5,
label = labels,
labelOptions = labelOptions(
style = list("font-weight" = "normal", padding = "3px 8px"),
textsize = "15px",
direction = "auto"),
group = "Lead Level"
) %>%
addPolygons(
weight = 1, # border thickness
opacity = 0.5, # border opacity
color = "grey", # border color
fillColor = ~poverty_palette(philadelphia_map@data$pct_families_below_poverty_line),
fillOpacity = 1,
label = labels,
labelOptions = labelOptions(
style = list("font-weight" = "normal", padding = "3px 8px"),
textsize = "15px",
direction = "auto"),
group = "Poverty Level"
) %>%
addMarkers(lat=most_vulnerable$lat, lng=most_vulnerable$lng) %>%
addLayersControl(
baseGroups = c("Lead Level", "Poverty Level"),
options = layersControlOptions(collapsed = FALSE)
) %>%
suspendScroll()
Want to save this and pass this javascript-powered map to your web developer to put on your website?
library(htmlwidgets)
my_map <- leaflet(philadelphia_map) %>%
addPolygons(
weight = 1, # border thickness
opacity = 0.5, # border opacity
color = "grey", # border color
fillColor = ~lead_palette(philadelphia_map@data$perc_5plus),
fillOpacity = 0.5,
label = labels,
labelOptions = labelOptions(
style = list("font-weight" = "normal", padding = "3px 8px"),
textsize = "15px",
direction = "auto"),
group = "Lead Level"
) %>%
addPolygons(
weight = 1, # border thickness
opacity = 0.5, # border opacity
color = "grey", # border color
fillColor = ~poverty_palette(philadelphia_map@data$pct_families_below_poverty_line),
fillOpacity = 1,
label = labels,
labelOptions = labelOptions(
style = list("font-weight" = "normal", padding = "3px 8px"),
textsize = "15px",
direction = "auto"),
group = "Poverty Level"
) %>%
addMarkers(lat=most_vulnerable$lat, lng=most_vulnerable$lng) %>%
addLayersControl(
baseGroups = c("Lead Level", "Poverty Level"),
options = layersControlOptions(collapsed = FALSE)
) %>%
suspendScroll()
saveWidget(my_map, file="../Media/my_map.html")